- This is only a rough draft - Megan 04/16/92
-
- WAIS-W3-X.500 BOF MINUTES
-
-
- BOF at the March 1992 IETF[1], on the evening of March 18.
-
-
- Summary
-
- This meeting followed discussion at the "living documents" BOF[2]
- the previous evening, and was more focussed in its discussion.
-
-
- The WAIS, World-Wide Web, and Prospero systems for network
- information retrieval (NIR) were presented (the Gopher protocol
- was presented in plenary the following day). The X.500 directory
- was presented in the light of NIR needs, as were two proposals to
- use the directory to refer to documents. A discussion followed on
- how to allow these systems to inter-operate, and on requirements
- for name spaces. A working group was proposed to define a
- generalized printable format for a name or address in any of
- these systems.
-
-
- Chair Steve Kille, UCL and ISODE consortium
-
-
- Present See list ietf-wwx-bof@info.cern.ch[3].
-
-
- These minutes are available in hypertext form using WWW as
- http://info.cern.ch./hypertext/Conferences/IETF92/WWX_BOF_mins.html
- as well as through the normal channels.
-
-
- WAIS
-
- John Curran of BBN presented the WAIS protocol, in the absence of
- anyone from Thinking Machines Corporation (TMC), which was
- originally responsible for it. The WAIS model is of a number of
- servers, each of which serves a number of databases, each of which
- contains a number of documents. Client software allows many
- databases to be searched at the same time. The server keeps an
- inverted full-text index for each database, so the search is very
- fast. Non-text files may also be served: recent extensions allow
- indexing of text files in new formats. The files indexed need not
- be copied, but the index is of the same order of size as the
- files.
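
The inverted-index idea described above can be illustrated with a minimal sketch. It is not the WAIS implementation: the function names and sample documents are hypothetical, and it matches documents containing every query word, whereas a real WAIS server ranks results by relevance.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def search_all(databases, query):
    """Run one query against several named databases, one index each."""
    return {name: search(index, query) for name, index in databases.items()}


# Hypothetical sample data: two databases, each with its own index.
rfcs = build_index({"rfc1": "directory services overview",
                    "rfc2": "mail transport"})
papers = build_index({"p1": "hypertext and directory services"})
hits = search_all({"rfcs": rfcs, "papers": papers}, "directory services")
```

A lookup touches only the index entries for the query words, which is why the search is fast; the cost is an index roughly as large as the text it covers, as the minutes note.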
-
-
- Many databases exist, but there is no scalable way of finding them
- (TMC currently keeps a master index). Use of X.500 was discussed.
-
-
- The WAIS protocol is an extended subset of Z39.50. The differences
- were discussed: WAIS allows relevance feedback ("Give me a
- document like this one"), and specifies how a query should be
- formulated. WAIS and Z39.50 have the same presentation layer.
-
-
- Documents in the Directory
-
- Wengyik Yeong presented his paper OSI-DS-22, "Representing public
- archives in the directory"[4]. His project puts information about
- documents, including the network address for retrieval, into the
- directory. He currently has RFCs and FYI documents in it, but would
- like to move on to other Internet archives. He concluded that he
- needed a more sophisticated approach: it was difficult to
- characterize arbitrary archives, with too little information
- available about them. (See IAFA WG[5].)
-
-
- The World-Wide Web
-
- Tim Berners-Lee presented the World-Wide Web (W3) and discussed
- requirements for interworking between the systems. The W3 project
- was initially funded to provide an information infrastructure to
- the world-wide community of high energy physicists. The data
- model is of documents which are hypertext and/or searchable
- indexes. The philosophy behind it is that a user should be able
- to point and click on a phrase or a word within a document, and
- the associated document would be retrieved from wherever in the
- world it is held and presented to the user in an appropriate
- format, without the user having to be aware of where the document
- is located or what the access method is. These details are hidden
- in the hypertext links. There were server programs for many
- information servers, gateways to WAIS, Archie and Gopher, and
- client programs for various user machines.
-
-
- The W3 clients use several protocols for accessing documents (FTP,
- NNTP, WAIS, Gopher, and W3's own "HTTP") although this is hidden
- from the user. The HTTP protocol is a simple stateless
- search/retrieve protocol running over TCP. As originally
- conceived but not yet implemented, it included authentication and
- data format negotiation. Tim discussed the differences between the
- WWW, WAIS, Archie, Gopher and Prospero systems.
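
A stateless search/retrieve exchange over TCP might be sketched as follows. The request shape (a single "GET" line, with a search expressed as "?" followed by "+"-joined keywords) follows early HTTP documentation rather than anything stated in these minutes, and the function names, port, and timeout are assumptions made for illustration.

```python
import socket


def build_request(path, keywords=None):
    """Build a one-line request; a search appends '?' and '+'-joined keywords."""
    if keywords:
        path = path + "?" + "+".join(keywords)
    return ("GET " + path + "\r\n").encode("ascii")


def fetch(host, path, keywords=None, port=80, timeout=10.0):
    """Open a connection, send one request, and read until the server closes.

    No state survives the call: each retrieval or search is a fresh
    connection carrying everything the server needs.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(path, keywords))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Because the server keeps nothing between requests, a client can follow a link to any other server without tearing down a session first.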
-
-
- The need for a Universal Document Identifier (UDI) was discussed,
- as outlined in OSI-DS-XX. A UDI describes the address of a
- document or, given a directory, its name, whatever its access
- protocol. Each application uses a "handle" for a file, which can
- be prefixed by the particular protocol name to generate a
- universal address.
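
The prefixing scheme can be sketched as below. The ":" separator and the function names are assumptions made for illustration (the separator matches the references listed at the end of these minutes); the actual UDI syntax is the subject of the discussion document, not of this sketch.

```python
def make_udi(protocol, handle):
    """Prefix an application's own handle with its protocol name."""
    if not protocol or ":" in protocol:
        raise ValueError("protocol name must be non-empty and contain no ':'")
    return protocol.lower() + ":" + handle


def split_udi(udi):
    """Split a universal address back into (protocol, handle)."""
    protocol, sep, handle = udi.partition(":")
    if not sep or not protocol:
        raise ValueError("not a protocol-prefixed address: " + repr(udi))
    return protocol, handle


# The handle is whatever the application already uses to name the file.
udi = make_udi("HTTP", "//info.cern.ch/hypertext/Conferences/IETF92/IETF-9203.html")
protocol, handle = split_udi(udi)
```

A client that receives such an address needs only the prefix to choose an access method; the remainder is passed unchanged to that protocol.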
-
-
- Most systems (WAIS excepted) are extensible, entertaining document
- addresses which refer to other systems. WAIS indexes can currently
- refer only to documents in the same database, not to documents in
- other databases, let alone those reached by other retrieval
- methods. There is a need for WAIS to be more flexible. John Curran
- said he would bring this to the attention of the WAIS community.
-
-
- Addresses would not in the long term be suitable for references to
- documents, so it was hoped that some sort of directory service,
- operating within the UDI framework, would be incorporated.
-
-
- More information: telnet info.cern.ch. Client and server code
- is available by anonymous FTP from info.cern.ch.
-
-
- Mailing lists: www-talk@info.cern.ch, www-interest@info.cern.ch
-
-
- Discussion document: OSI-DS-29[6]
-
-
- Representing the Real World in the Directory
-
- Paper: OSI-DS-25[7]
-
- Steve Kille discussed this paper, "Representing the Real World in
- an X.500 Directory".
-
-
- A Listing Service may be used to group like information items
- together, for example to provide a Yellow Pages Service.
-
-
- Such a service could, for example, provide for members of a special
- interest group, or could group documents on a particular
- subject. Services such as Archie could be considered to be Listing
- Services. One imagines an information Universe in which
- Information Brokers provide different (say, subject-based) views
- via their listing services. One would then need to locate the
- various listing services (using a mechanism such as a directory?).
-
-
- UK British Library Project
-
- Paul Barker described a project, sponsored by the British Library,
- to represent grey literature (unpublished research papers) in the
- Directory. The project is thought to be unlikely to succeed - but
- one of the aims is to demonstrate whether or not it is possible.
- They will take the (UK) MARC records and model these within X.500.
- They might also consider trying to provide a listing service so
- that the documents might be retrieved more readily by subject
- area.
-
-
- Prospero
-
- Cliff Neuman described Prospero. It follows a file system model,
- rather than the hypertext model. It is built on UDP for speed.
- It has the notion of a Directory which contains links to other
- objects (other directories or files). It returns the link to the
- information object and then automatically retrieves the file by
- the appropriate access method (Archie, WAIS, NNTP, NFS, FTP, etc.;
- WWW soon!). It has been used very successfully to access the
- Archie database.
-
-
- Cliff stated that he expected to be able to use X.500 to translate
- between the document ID and how to get the document.
-
-
- With Prospero the user has his own view of the global information
- base (or has a view built for him). Cliff thought there should be
- multiple name spaces, but the difficulty would be that these
- would need representing near the top of the directory tree. With
- multiple user-chosen views, this would be difficult to manage.
- Also, two users might refer to an object by different handles,
- each relative to their individual name spaces, which is difficult
- when passing references (say in a mail message) from one person to
- the other.
-
-
- The concept of "Closure": Each object has a related name space.
- All references within the object are resolved using the context of
- the name space. Name spaces themselves have global network
- addresses, but the user doesn't see that.
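
The closure idea can be sketched as follows: a name is always carried together with the name space in which it is to be resolved. This is an illustration only, not Prospero's data model; the class names, the "ns://" addresses, and the sample bindings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NameSpace:
    address: str                                  # global address, hidden from the user
    bindings: dict = field(default_factory=dict)  # local name -> object address

    def resolve(self, name):
        """Look a local name up in this name space."""
        return self.bindings[name]


@dataclass
class Reference:
    """A name paired with its name space (its closure)."""
    namespace: NameSpace
    name: str

    def resolve(self):
        return self.namespace.resolve(self.name)


# Two users with their own views: the same name means different things.
alice = NameSpace("ns://host-a/alice", {"minutes": "file://host-x/bof.txt"})
bob = NameSpace("ns://host-b/bob", {"bof-notes": "file://host-x/bof.txt",
                                    "minutes": "file://host-y/other.txt"})

bare_name = "minutes"                  # ambiguous once it leaves Alice's view
closed = Reference(alice, "minutes")   # unambiguous wherever it is resolved
```

Passing the bare name from Alice to Bob silently changes its meaning, which is the difficulty noted above; passing the closed reference does not.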
-
-
- More information: info-prospero@isi.edu
-
-
-
- System 33
-
- Larry Masinter talked about a project at Xerox PARC. This has the
- concepts:
-
-
- HANDLE 32 byte number (is a content ID). In fact
- this contains hints for finding the
- document.
-
-
- FILE Location (6 part)
- Protocol; Host; Path; piece; format;
- timeout
-
-
- Description (normal "Catalogue" information: Name,
- Author, etc)
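
The three concepts might be modelled as plain records, as in the sketch below. Only the field names and the 32-byte handle size come from the minutes; the field types, the unit of the timeout, and the class names are assumptions, and the sketch does not attempt to model the location hints that the handle is said to contain.

```python
from dataclasses import dataclass

HANDLE_SIZE = 32  # bytes, as stated in the minutes


@dataclass(frozen=True)
class FileLocation:
    """The six-part location; all field types here are assumed."""
    protocol: str
    host: str
    path: str
    piece: str
    format: str
    timeout: int  # unit assumed to be seconds


@dataclass(frozen=True)
class Description:
    """Normal catalogue information; only two fields shown."""
    name: str
    author: str


@dataclass(frozen=True)
class Document:
    handle: bytes
    locations: tuple
    description: Description

    def __post_init__(self):
        # Reject handles that are not exactly 32 bytes long.
        if len(self.handle) != HANDLE_SIZE:
            raise ValueError("handle must be exactly 32 bytes")


# Hypothetical example record.
doc = Document(
    handle=bytes(HANDLE_SIZE),
    locations=(FileLocation("ftp", "host.example", "/pub/memo", "all", "text", 60),),
    description=Description(name="Example memo", author="A. N. Author"),
)
```

Keeping the handle separate from the list of locations lets one document be reachable by several protocols at once.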
-
-
- There is format negotiation when a document is retrieved. It is
- not simple in reality to categorize data formats as there is such
- a plethora of different varieties.
-
-
- Gateways provide access between systems not sharing transport
- protocols.
-
-
- Access control was also considered: the ACL is part of the
- description. The server exploits multiple protocols for search and
- retrieval.
-
-
- There is a problem with dealing with different types of document
- (applications for jobs, product specs, memos, contracts, faxes,
- etc.). It is difficult to normalize the attributes of a general
- document.
-
-
- Summing up
-
- Tim Berners-Lee summed up by saying that all applications
- described used resolvable document addresses, and so for
- interworking, we need a universal representation for such a
- network object address. With the coming of directories, names
- should increasingly be used in place of network addresses. The
- Universal Document Identifier was intended to be able to hold
- either a name or an address for any access protocol. (This is not
- the same as a "USDN", a document serial number which is not
- resolvable, but only one of which exists for each document.)
-
-
- In discussion, Steve Kille suggested there should be a WG on
- details of UDIs and a separate one for USDNs. A comment was that
- the W3 data model encompasses those of the other systems. John
- Curran insisted on a better term than "UDI", suggesting "Document
- Access Token".
-
-
- Peter Deutsch's need for a USDN is to be able to determine the
- equivalence of two USDNs. Chris Weider agreed to co-author a
- document on the issues. Jill Foster suggested a pilot project to
- put UDIs in the directory for a set of documents and to have the
- Gopher, Prospero, and Archie people try to utilise these.
-
- [These minutes have been largely built from Jill Foster's
- report[8] and Karen Sollins' notes[9], for which I am most
- grateful, though errors in the above are probably mine. Tim BL]
-
-
-
- References:
-
- [1] http://info.cern.ch/hypertext/Conferences/IETF92/IETF-9203.html
- [2] http://info.cern.ch/hypertext/Conferences/IETF92/LivingDocuments.html
- [3] http://info.cern.ch/hypertext/WWW/Administration/Mailing/ietf-wwx-bof
- [4] file://cs.ucl.ac.uk/osi-ds/osi-ds-22-00.txt
- [5] http://info.cern.ch/hypertext/Conferences/IETF92/IAFA-BOF.html
- [6] file://cs.ucl.ac.uk/osi-ds/osi-ds-29-00.txt
- [7] file://cs.ucl.ac.uk/osi-ds/osi-ds-25-00.txt
- [8] http://info.cern.ch/hypertext/Conferences/IETF92/WWX_BOF.html
- [9] http://info.cern.ch/hypertext/Conferences/IETF92/WWX_BOF_Sollins.html
-
-